DAFx Paper Archive - Browse all papers by Fazekas, G.

Novel methods in Information Management for Advanced Audio Workflows

DAFx-2009 - Como

This paper discusses architectural aspects of a software library for unified metadata management in audio processing applications. The data incorporates editorial, production, acoustical and musicological features for a variety of use cases, ranging from adaptive audio effects to alternative metadata-based visualisation. Our system is designed to capture information prescribed by a modular ontology schema. This advocates the development of intelligent user interfaces and advanced media workflows in music production environments. In an effort to reach these goals, we argue for the need for modularity and interoperable semantics in representing information. We discuss the advantages of extensible Semantic Web ontologies as opposed to using specialised but disharmonious metadata formats. Concepts and techniques permitting seamless integration with existing audio production software are described in detail.

Towards Ontological Representations of Digital Audio Effects

Thomas Wilmering; György Fazekas; Mark B. Sandler

DAFx-2011 - Paris

In this paper we discuss the development of ontological representations of digital audio effects and provide a framework for the description of digital audio effects and audio effect transformations. After a brief account of our current research in the field of high-level semantics for music production using Semantic Web technologies, we detail how an Audio Effects Ontology can be used within the context of intelligent music production tools, as well as for musicological purposes. Furthermore, we discuss problems in the design of such an ontology arising from discipline-specific classifications, such as the need for encoding different taxonomical systems based on, for instance, implementation techniques or perceptual attributes of audio effects. Finally, we show how information about audio effect transformations is represented using Semantic Web technologies, specifically the Resource Description Framework (RDF), and retrieved using the SPARQL query language.

Automatic Control of the Dynamic Range Compressor Using a Regression Model and a Reference Sound

Di Sheng; György Fazekas

DAFx-2017 - Edinburgh

Practical experience with audio effects as well as knowledge of their parameters and how they change the sound is crucial when controlling digital audio effects. This often presents barriers for musicians and casual users in the application of effects. These users are more accustomed to describing the desired sound verbally or using examples, rather than understanding and configuring low-level signal processing parameters. This paper addresses this issue by providing a novel control method for audio effects. While a significant body of work focuses on the use of semantic descriptors and visual interfaces, little attention has been given to an important modality, the use of sound examples to control effects. We use a set of acoustic features to capture important characteristics of sound examples and evaluate different regression models that map these features to effect control parameters. Focusing on dynamic range compression, results show that our approach provides a promising first step in this direction.

Differentiable Time–frequency Scattering on GPU

John Muradeli; Cyrus Vahidi; Changhong Wang; Han Han; Vincent Lostanlen; Mathieu Lagrange; George Fazekas

DAFx-2022 - Vienna

Joint time–frequency scattering (JTFS) is a convolutional operator in the time–frequency domain which extracts spectrotemporal modulations at various rates and scales. It offers an idealized model of spectrotemporal receptive fields (STRF) in the primary auditory cortex, and thus may serve as a biologically plausible surrogate for human perceptual judgments at the scale of isolated audio events. Yet, prior implementations of JTFS and STRF have remained outside of the standard toolkit of perceptual similarity measures and evaluation methods for audio generation. We trace this issue to three limitations: differentiability, speed, and flexibility. In this paper, we present an implementation of time–frequency scattering in Python. Unlike prior implementations, ours accommodates NumPy, PyTorch, and TensorFlow as backends and is thus portable on both CPU and GPU. We demonstrate the usefulness of JTFS via three applications: unsupervised manifold learning of spectrotemporal modulations, supervised classification of musical instruments, and texture resynthesis of bioacoustic sounds.
